- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 10.9k
[custom_op][vllm-plugin] update custom_op class to use op_registry #19164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run  Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add  🚀 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @xuechendi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team, gemini-code-assist here to provide a summary of this pull request. This PR addresses an issue where the existing CustomOp implementation didn't fully leverage the op_registry for initialization, making it difficult for plugins to properly override default operator implementations without resorting to monkey patching. The core idea is to modify the CustomOp class so that when it's instantiated, it looks up the actual implementation class in the op_registry and instantiates that instead. Additionally, the @CustomOp.register decorator is enhanced to allow plugin-provided custom ops to explicitly override existing registered ops, providing a cleaner mechanism for plugins to inject their own implementations.
Highlights
- Custom Op Instantiation Logic: A __new__method is added to the baseCustomOpclass. This method intercepts the class creation process, checks theop_registryfor a class registered under the same name, and delegates the actual object creation to the registered class. This ensures that when code tries to instantiateCustomOp(or a subclass registered with a specific name), it gets the correct, potentially overridden, implementation from the registry.
- Enhanced Registration for Overrides: The @CustomOp.registerdecorator is updated to accept an optionalcustom_op=Trueargument. When this flag is set and a class with the given name is already registered, the decorator will now replace the existing entry in theop_registrywith the new class being registered. This provides a clear and intended way for plugins to override default operator implementations.
Changelog
- vllm/model_executor/custom_op.py
- Added __new__method toCustomOpto enable dynamic instantiation from theop_registry(lines 19-23).
- Modified the registerclass method decorator to acceptcustom_op=Trueand allow overriding existing registered ops (lines 150-161).
 
- Added 
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description | 
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. | 
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. | 
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. | 
| Help | /gemini help | Displays a list of available commands. | 
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A decorator, a clever wrap,
Changes how functions clap.
With@register's might,
Ops take new flight,
No more monkey patch trap.
Footnotes
- 
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩ 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant improvement to the CustomOp registration and instantiation mechanism, enabling plugins to override existing operators by leveraging the op_registry. The changes in __new__ and the register decorator are central to this enhancement.
While the direction is good, there are a few critical and high-severity issues in the register method's logic that could lead to unexpected behavior or errors, particularly concerning how overriding classes are handled and how name collisions are managed when custom_op=False. Additionally, the __new__ method could be made more robust in handling unregistered classes.
Addressing these points will greatly improve the reliability and predictability of the custom operator system.
Summary of Findings
- Robustness of __new__method: The__new__method inCustomOpcould lead to an unhandledAttributeErrorifcls.nameis not defined. Improved error handling is recommended.
- Missing nameattribute setting inregister: When overriding an operator usingcustom_op=Truein theregisterdecorator, thenameattribute is not set on the overriding class, which can cause issues during its instantiation.
- Incorrect return value from registerdecorator: Theregisterdecorator may return a different class than the one being decorated in cases of name collision withcustom_op=False. This violates decorator conventions and can lead to subtle bugs. It should consistently return the decorated class or raise an error on invalid registration attempts.
Merge Readiness
The pull request makes valuable changes to the custom operator framework. However, there are critical and high-severity issues identified in the register method's logic, along with a medium-severity robustness concern in the __new__ method. These issues should be addressed before merging to ensure the stability and predictability of the custom operator system. I am unable to approve this PR myself, and it should be reviewed and approved by others once these changes are made.
| @simon-mo @MengqingCao @aurickq , please take a review. | 
1190506    to
    92f0202      
    Compare
  
    | @mgoin @youkaichao @wangxiyuan , please also have a review on this PR. As part of RFC #19161 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your efforts!
This change is overall lgtm, but there is one small suggestion I have commented.
And could you add test and docstring for the usage?
d44b398    to
    1bb1ae9      
    Compare
  
    | @MengqingCao , I updated the register decorator arg_name and also add an UT for oot_custom_op, please review again, Thanks. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is reasonable to me, thanks for this. probably want to cc either @youkaichao or @houseroad for this?
| Also cc @zou3519 as well. | 
        
          
                tests/plugins/vllm_add_dummy_platform/vllm_add_dummy_platform/dummy_custom_ops.py
              
                Outdated
          
            Show resolved
            Hide resolved
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some nits about the API. CustomOp.register is used for registering a new class of custom operators. What we're doing here with the is_oot_custom_op flag is that is actually that whenever a RotaryEmbedding is initialized, it will actually be replaced with a DummyRotaryEmbedding.
It also sounds like you might want to replace special variants, like Llama4RotaryEmbedding with NPURotaryEmbedding. The first input to register is supposed to be the "base name" of a custom operator (like "RotaryEmbedding"), but it sounds like in your case you may want to use "Llama4RotaryEmbedding" instead.
If we're going with this design, I think there should be a separate API because it seems like something separate from CustomOp.register.
@RotaryEmbedding.monkeypatch  # or RotaryEmbedding.use_out_of_tree if you want it to be nicer.
class DummyRotaryEmbedding(RotaryEmbedding):
    passThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the seperate API will lead to bigger changes to existing vllm codes right? Currently, since most of the ops are already derived from custom_op, it is much smaller change in this PR to make things working.
And the main reason is also we want to avoid monkey patch, so we want to have a single entry which is CustomOp.
To clarify, we want do things like:
replace Llama4RotaryEmbedding  to HPULlama4RotaryEmbedding
replace RotaryEmbedding to HPURotaryEmbedding
And HPULlama4RotaryEmbedding is different to HPURotaryEmbedding in impl
here is the actual impl: https://github.com/HabanaAI/vllm-hpu-extension/tree/plugin_poc/vllm_hpu/ops
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will be the major impact if we reuse custom_op to register oot_op vs using the way you suggested by decorate on actual layer class?
I think it makes more sense to us by reusing custom_op, this prevent massive update to vllm.
If you think reuse 'custom_op.register' might bring confusion, I can add a new decorator in custom_op and using 'custom_op.register_oot' instead, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zou3519 , I've updated the PR based on your suggestion. Now either using CustomOP.register_oot('RotaryEmbedding') or RotaryEmbedding.register_oot will register custom class to CustomOp
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
fd4678e    to
    bf79f74      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, this looks reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change. It's fine
| @simon-mo , can you review the PR again, thanks | 
| @simon-mo is out this week, maybe @youkaichao or @WoosukKwon ? | 
Signed-off-by: nie3e <adrcwiek@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> added notebooks to playground updates remoted verbatim HF secrets from all files updates [custom_op][vllm-plugin] update custom_op class to use op_registry (vllm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Export NaNs in logits to scheduler_stats if output is corrupted (vllm-project#18777) Signed-off-by: Vlad Mihailescu <vtmihailescu@gmail.com> [CPU][CI] Fallback sliding window to v0 and fix CPU pooling model tests (vllm-project#19901) Signed-off-by: jiang1.li <jiang1.li@intel.com> [Kernel] mark TorchSDPABackend swap_blocks NotImplementedError (vllm-project#19749)
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: juncheoll <th6re8e@naver.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: fhl <2410591650@qq.com>
Co-Authored with @kzawora-intel
Essential Elements of an Effective PR Description Checklist
Purpose
POC for #19161
Problem:
Solution:
Test Plan
Verfied by a poc hpu plugin
Test Result
Verified that HPUUnquantizedFusedMoEMethod in hpu plugin is called.